pdb_stats <- read.csv("./Data Export Summary.csv", row.names=1)
pdb_stats
##                          X.ray   NMR   EM Multiple.methods Neutron Other  Total
## Protein (only)          142419 11807 6038              177      70    32 160543
## Protein/Oligosaccharide   8426    31  991                5       0     0   9453
## Protein/NA                7498   274 2000                3       0     0   9775
## Nucleic acid (only)       2368  1378   60                8       2     1   3817
## Other                      149    31    3                0       0     0    183
## Oligosaccharide (only)      11     6    0                1       0     4     22
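Before computing percentages it can help to confirm that the import produced numeric columns and that each row's Total equals the sum of its method columns. The sketch below is only an illustration: it rebuilds the table above by hand under the hypothetical name pdb_check so that it runs without the CSV file. With the real file the same checks could be applied to pdb_stats directly. If an export uses thousands separators (for example "142,419"), read.csv will read those columns as text and the first check will fail.

```r
# Rebuild the printed table by hand (values copied from the output above).
# `pdb_check` is a hypothetical name used only for this illustration.
pdb_check <- data.frame(
  X.ray            = c(142419, 8426, 7498, 2368, 149, 11),
  NMR              = c(11807, 31, 274, 1378, 31, 6),
  EM               = c(6038, 991, 2000, 60, 3, 0),
  Multiple.methods = c(177, 5, 3, 8, 0, 1),
  Neutron          = c(70, 0, 0, 2, 0, 0),
  Other            = c(32, 0, 0, 1, 0, 4),
  Total            = c(160543, 9453, 9775, 3817, 183, 22),
  row.names = c("Protein (only)", "Protein/Oligosaccharide", "Protein/NA",
                "Nucleic acid (only)", "Other", "Oligosaccharide (only)")
)

# Check 1: every column was read as a number, not as text.
all_numeric <- all(sapply(pdb_check, is.numeric))

# Check 2: the Total column agrees with the sum of the method columns.
method_cols  <- setdiff(names(pdb_check), "Total")
totals_match <- rowSums(pdb_check[, method_cols]) == pdb_check$Total

all_numeric
all(totals_match)
```

Both checks should come back TRUE for the table shown here. If either fails on a fresh export, the percentages computed later would be unreliable.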
Q1: What percentage of structures in the PDB are solved by X-ray and electron microscopy?
X-ray: 87.53%; electron microscopy: 4.95% (92.48% combined).
# Find the proportion for each method separately (multiply by 100 for a percentage)
sum(pdb_stats$X.ray)/ sum(pdb_stats$Total)
## [1] 0.8752836
sum(pdb_stats$EM)/ sum(pdb_stats$Total)
## [1] 0.0494687
# Compute the percentage for every column (i.e. every experimental method)
round(colSums(pdb_stats) / sum(pdb_stats$Total) * 100, 2)
## X.ray NMR EM Multiple.methods
## 87.53 7.36 4.95 0.11
## Neutron Other Total
## 0.04 0.02 100.00
Q2: What proportion of structures in the PDB are protein?
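87.35% (the "Protein (only)" row as a share of all entries)
# Select the row by name rather than by position, so a reordered table cannot silently change the answer
round(pdb_stats["Protein (only)", "Total"] / sum(pdb_stats$Total) * 100, 2)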
## [1] 87.35
Q3: Type HIV in the PDB website search box on the home page and determine how many HIV-1 protease structures are in the current PDB?
23409
Import protein structure